SYSTEM AND METHOD FOR RISK-ADJUSTING INDICATORS OF ACCESS 
AND UTILIZATION BASED ON METRICS OF DISTANCE AND TIME 

CROSS-REFERENCE TO RELATED APPLICATIONS 
[0001] The application claims the benefit of U.S. Provisional Application No.
60/446,692, filed February 11, 2003.

STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH OR 

DEVELOPMENT 

[0002] Not applicable. 

TECHNICAL FIELD 

[0003] The present invention relates to a system and method for risk-adjusting 

indicators of access and utilization of health care services based on metrics of distance, 
which may be either in terms of geographic distance or time.

BACKGROUND OF THE INVENTION 
[0004] Prevention is an important role for all health care providers. Providers can

help individuals stay healthy by preventing disease, and they can prevent complications of 
existing disease by helping patients live with their illnesses. To fulfill this role, however, 
providers need data on the impact of their services and the opportunity to compare these 
data over time or across communities. Local, State, and Federal policymakers also need 
these tools and data to identify potential access or quality-of-care problems related to 
prevention, to plan specific interventions, and to evaluate how well these interventions 
meet the goals of preventing illness and disability. 
[0005] Quality indicators may be a set of measures that can be used with health

system encounter data to identify "ambulatory care sensitive conditions" (ACSCs). ACSCs 
are conditions for which good outpatient care can potentially prevent the need for 
hospitalization, or for which early intervention can prevent complications or more severe 
disease. 

[0006] Even though these indicators are based on hospital inpatient data, they

provide insight into the quality of the health care system outside the hospital setting. 
Patients with newly diagnosed cancer may have poor survival or quality of life if their 
cancer management (chemotherapy, radiotherapy, etc.) is delayed more than a few weeks 
following diagnosis. Patients with diabetes may be hospitalized for diabetic complications 
if their conditions are not adequately monitored or if they do not receive the patient 
education needed for appropriate self-management. Patients may be hospitalized for 
asthma if primary care providers fail to adhere to practice guidelines or to prescribe 
appropriate treatments. Patients with appendicitis who do not have ready access to surgical 
evaluation may experience delays in receiving needed care, which can result in a life- 
threatening condition — perforated appendix. 

SUMMARY OF THE INVENTION 
[0007] The present invention relates to a system and method for risk-adjusting 

indicators of access and utilization of health care services based on metrics of distance, 
which may be either in terms of geographic distance or time. The risk-adjusted indicators 
are useful for determining the adequacy of access to care services within populations of 
varying rurality and managing resources related to high-quality provision of care services
in metropolitan, suburban, and rural areas. 
[0008] Indicators addressed by the present invention include (but are not limited to) 

the following ambulatory care sensitive conditions, which are measured as rates of 
encounter with the health system, regardless of the point of origination of the episode that
generates the encounter. 

[0009]

• Bacterial pneumonia
• Dehydration
• Pediatric gastroenteritis
• Urinary tract infection
• Perforated appendix
• Low birth weight
• New cancer management delay > 14 days after initial diagnosis or treatment
• Congestive heart failure (CHF)
• Hypertension
• Adult asthma
• Pediatric asthma
• Chronic obstructive pulmonary disease (COPD)
• Diabetes short-term complication
• Diabetes long-term complication
• Uncontrolled diabetes
• Lower-extremity amputation among patients with diabetes


[0010] Although other factors outside the direct control of the health care system,
such as poor environmental conditions or lack of patient adherence to treatment 
recommendations, can result in hospitalization, the indicators provide a meaningful starting 
point for assessing quality of health services in the community. Because the risk-adjusted 
indicators are calculated using readily available health system data, they are an easy-to-use 
and inexpensive screening tool. They can be used to provide a window into the 
community: to identify underserved or under-resourced community health care needs, to
monitor how well complications from a number of common conditions are being avoided 
in the outpatient setting, and to compare performance of local health care systems across 
communities. 
[0011] Properly risk-adjusted indicators assess the quality of the health care system

as a whole, and especially the quality of ambulatory care, in preventing medical 
complications. As a result, these measures are likely to be of the greatest value when 
calculated at the population level and when used by public health groups, data warehousing 
organizations, and other organizations concerned with the health of populations. 

[0012] These indicators serve as a screening tool rather than as definitive measures 

of quality problems. They can provide initial information about potential problems in the 
community that may require further, more in-depth analysis. Policy makers and health care 
providers can use the risk-adjusted indicators to answer questions such as: 

• How does the low birth weight rate in my locale compare with the national 
average? 

• What can the rate of new cancer management encounters exceeding fourteen 
days tell me about the adequacy of oncology care in my community? 

• Does the admission rate for diabetes complications in my community 
suggest a problem in the provision of appropriate outpatient care to this 
population? 

• How does the admission rate for congestive heart failure vary over time and 
from one region of the country to another? 

[0013] Government policy makers and local community organizations can use the 

indicators to assess and improve community health care. In order to do so in a valid and 
reliable manner, the indicators must generally be confirmed to have adequate precision and 
accuracy, and the indicators must be risk-adjusted to correct for variations in age and 
distance from access to care. 

[0014] Access to various types of care services and treatments will vary for people 

living in the same county. Furthermore, a considerable number of care episodes 
(encounters with the health system) may begin when the person is at work or at locations
other than their residence. For evaluation and planning purposes, health systems and 
public health services need to be able to measure access to care and quality of availability 
of care regardless of where geographically care episodes begin. The distance index and 
model of the present invention can be used for econometrics and clinical process consulting 
work with health care organizations in various countries and in various regions within any 
country, irrespective of the locale's rurality and regardless of how much of the health care 
provided by institutions in that locale is delivered to persons whose episodes of care 
originated outside the nominal catchment area for that locale's health jurisdiction. The 
distance index set forth in the present invention can utilize distance either measured in 
miles (kilometers) or elapsed-time minutes from the inception of a clinical event or need 
for care, until the provision of care at an appropriate location of service. (The minutes or 
geographical distance are statistical distributions, measurable and aggregated, in preferred 
embodiments, on a monthly or quarterly basis, from cases accruing in each catchment 
area.) 

[0015] A preferred embodiment of the present invention for the United Kingdom

uses the "Postcode District" (or PD), or, in another preferred embodiment for the United 
States, the present invention uses the 3-digit zip code or county FIPS (Federal Information 
Processing Standards) to identify geographic localities from which the captured cases 
originated. In the preferred embodiment for the U.K., the originating geographic locality is
not identified with respect to the SHA and Hospital Trust geographic boundaries, which do
not necessarily correspond to where people live or where care episodes begin. The PD is the
first part of a U.K. Postcode, before the space, and typically comprises two
to four characters. It is used to specify the town or district to which a letter or package is to
be sent for further sorting. Once the PD is received, the present invention obtains the 
census population and latitude-longitude GIS (Geographic Information System) 
coordinates for the centroid of each PD. 
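
As an illustration of the locality keys described above, the following minimal Python sketch extracts a Postcode District and a 3-digit ZIP prefix. The function names are hypothetical, and the parsing assumes well-formed inputs (a U.K. Postcode that contains its space, a U.S. ZIP code with at least five digits). The lookup of census population and centroid coordinates is not shown.

```python
def postcode_district(postcode: str) -> str:
    """Return the part of a U.K. Postcode before the space, upper-cased."""
    outward, _, _inward = postcode.strip().upper().partition(" ")
    return outward


def zip3(zip_code: str) -> str:
    """Return the first three digits of a U.S. ZIP code."""
    digits = "".join(ch for ch in zip_code if ch.isdigit())
    if len(digits) < 5:
        raise ValueError("expected at least a 5-digit ZIP code")
    return digits[:3]
```

Either key can then serve as the grouping variable for the catchment-area statistics.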
[0016] As known in the art, distance (or rurality) indexes suffer from three major 

difficulties, with regard to the purpose of risk-adjusting metrics denoting access to health 
services: 

• Failure to accurately and fully represent the continuum from rural to 
suburban to metropolitan, from fewer than one person per square kilometer 
to many hundreds or thousands of persons per square kilometer; from less 
than five minutes to access care to many hours or even days to access care 
for certain specialty services. 

• "Lumping" or assignment of a county-level distance index to all individuals 
living in a particular jurisdiction, which inaccurately represents the fine 
structure of access within the jurisdiction. 

• "Aggregation" and "norming" to macro sociopolitical levels (national or 
other), which obscures detailed small-area variation in access and 
composition of groups under study and prevents interpretation of differences 
among these groups. 



[0017] The distance index described in the present invention avoids these pitfalls. 

First, by using resident-level case data the resulting index differentiates accurately between 
different locales within a county. Only two variables are required, one from each of the 
two following categories: 

• P: county population, or county population density. 

• D: distance in miles or kilometers, or distance in elapsed-minutes. 
[0018] The method for calculating the distance index described herein further

provides for automatic calculation of optimal parameters for a power transform, such that 
approximate normality for the purpose of statistical inferencing is achieved. 

[0019] For each care episode and the person or family to which it pertains, a power
transform is used both for the P-variable of the locale in which the episode originates and
for the D-variable. In the present invention, the Box-Cox transform involves iterative
determination of optimal values for $\lambda_1$, the power to which each $D_i$ for the i-th care episode
is raised, and $\lambda_2$, the power to which each $P_i$ for the i-th county or catchment area is raised.
The transformed values are scaled by their standard deviations, resulting in
standardized values. The transformation is expressed as:

\[ D_i' = \operatorname{sign}(D_i)\,\frac{D_i^{\lambda_1}}{\operatorname{std}(D^{\lambda_1})} \]

\[ P_i' = \operatorname{sign}(P_i)\,\frac{P_i^{\lambda_2}}{\operatorname{std}(P^{\lambda_2})} \]

where $\operatorname{std}(P^{\lambda_2})$ is the sample standard deviation of $P_i^{\lambda_2}$, and similarly for $D_i^{\lambda_1}$, and where

\[ \operatorname{sign}(x) = \begin{cases} +1 & \text{if } x > 0 \\ -1 & \text{if } x < 0 \end{cases} \]


[0020] Then the two measures in each distance and population pair are weighted
and summed to produce an intermediate provisional distance index. The distance metric is
given a positive weight to ensure that the index will increase with increasing distance from
the source of care services, and the population metric is given a negative weight to ensure
that the index will decrease as population or population density increases. Using the
weighting in the preferred embodiment, the distance index for the i-th episode is denoted $I_i$:

\[ I_i = (w)\left[\operatorname{sign}(D_i)\,\frac{D_i^{\lambda_1}}{\operatorname{std}(D^{\lambda_1})}\right] - \left[\operatorname{sign}(P_i)\,\frac{P_i^{\lambda_2}}{\operatorname{std}(P^{\lambda_2})}\right] \]

where $w$ stands for the positive weight on the distance term, whose numeric value is illegible in the source text.

[0021] The $I_i$ values are standardized, producing a scaled distance index for the i-th
episode, denoted d_episode(i).

The Anderson-Darling statistic $A^2$ is calculated for the distribution of distance index values, to assess
departure from a normal curve. If the value of $A^2$ is greater than or equal to the critical value $A^2_\alpha$,
then the null hypothesis of normality is rejected, the values of $\lambda_1$ and $\lambda_2$ are incremented,
and the loop processing is repeated. Iterations continue until $A^2$ is less than $A^2_\alpha$.
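
The iterative procedure of paragraphs [0019] to [0021] can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: the power is applied to the absolute value before the sign is restored, both weights default to 1.0, the index is standardized as a z-score, the Anderson-Darling statistic uses the small-sample adjustment $A^2(1 + 0.75/n + 2.25/n^2)$ with the commonly tabulated 5% critical value 0.752 for a normal with estimated mean and variance, and the exponents are stepped over a grid from 0.1 to 2.0. All function and parameter names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def scaled_power(values, lam):
    """sign(x) * |x|**lam, divided by the sample standard deviation of the result."""
    powered = [math.copysign(abs(x) ** lam, x) for x in values]
    spread = stdev(powered)
    return [v / spread for v in powered]


def anderson_darling(sample):
    """Adjusted A^2 against a normal with estimated mean and variance."""
    n = len(sample)
    m, s = mean(sample), stdev(sample)
    std_normal = NormalDist()
    # Clamp the CDF away from 0 and 1 so the logarithms stay finite.
    cdf = sorted(min(max(std_normal.cdf((x - m) / s), 1e-12), 1 - 1e-12)
                 for x in sample)
    total = sum((2 * i + 1) * (math.log(cdf[i]) + math.log(1 - cdf[n - 1 - i]))
                for i in range(n))
    a2 = -n - total / n
    return a2 * (1 + 0.75 / n + 2.25 / n ** 2)


def fit_distance_index(distances, populations, w_dist=1.0, w_pop=1.0,
                       critical=0.752,
                       lambdas=tuple(k / 10 for k in range(1, 21))):
    """Step through (lambda1, lambda2) until A^2 < critical.

    Returns (a2, lambda1, lambda2, index); if no pair passes, the pair with
    the smallest A^2 seen is returned instead.
    """
    best = None
    for lam1 in lambdas:
        for lam2 in lambdas:
            d = scaled_power(distances, lam1)
            p = scaled_power(populations, lam2)
            # Positive weight on distance, negative weight on population.
            raw = [w_dist * di - w_pop * pi for di, pi in zip(d, p)]
            m, s = mean(raw), stdev(raw)
            index = [(v - m) / s for v in raw]  # assumed z-score standardization
            a2 = anderson_darling(index)
            if best is None or a2 < best[0]:
                best = (a2, lam1, lam2, index)
            if a2 < critical:
                return best
    return best
```

Returning the best pair seen when no pair passes is a design choice of this sketch; the text itself only describes iterating until the normality criterion is met.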
[0022] Risk-adjustment of indicator incidence rates may follow any of the methods
known to those experienced in the art. The risk-adjustment must then be validated
according to accepted statistical practices before interpretations and conclusions are drawn,
or before the optimized values for $\lambda_1$ and $\lambda_2$ are deployed in a public health decision
support software system.

[0023] The first step in the validation involves precision tests to determine the

reliability of the indicator for distinguishing real differences in provider performance. For 
indicators that may be used for quality improvement, it is important to know with what 
precision, or surety, a measure can be attributed to an actual construct rather than random
variation. 

[0024] For each indicator, the variance can be broken down into three components: 

variation within a provider (actual differences in performance due to differing patient 
characteristics), variation among providers (actual differences in performance among 
providers), and random variation. An ideal indicator would have a substantial amount of 
the variance explained by between-area or between-provider variance, possibly resulting 
from differences in quality of care access, and a minimum amount of random variation. In 
the preferred embodiment, four tests of precision are used to estimate the magnitude of 
between-provider variance on each indicator: 

• Signal standard deviation is used to measure the extent to which 
performance of the indicator varies systematically across hospitals or areas. 

• Provider/area variation share is used to calculate the percentage of signal (or 
true) variance relative to the total variance of the indicator. 

• Signal-to-noise ratio is used to measure the percentage of the apparent 
variation in indicators across providers that is truly related to systematic 
differences across providers and not random variations (noise) from year to 
year. 

• In-sample R-squared is used to identify the incremental benefit of applying 
multivariate signal extraction methods for identifying additional signal on 
top of the signal-to-noise ratio. 
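
The text does not give formulas for these precision tests. As a hedged illustration only, the sketch below uses one conventional moment-based decomposition: the observed variance of provider rates is treated as signal variance plus average sampling (noise) variance. The function name, the dictionary keys, and the decomposition itself are assumptions of this example.

```python
from statistics import mean, pvariance


def precision_summary(rates, sampling_variances):
    """Split the observed between-provider variance into signal and noise parts."""
    total = pvariance(rates)              # observed variance across providers
    noise = mean(sampling_variances)      # average sampling variance of a rate
    signal = max(total - noise, 0.0)      # moment estimate, floored at zero
    return {
        "signal_sd": signal ** 0.5,
        "signal_ratio": signal / total if total > 0 else 0.0,
    }
```

Here `signal_ratio` corresponds to the share of apparent across-provider variation that is systematic rather than noise.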

[0025] In general, random variation is most problematic when there are relatively
few observations per provider, when adverse outcome rates are relatively low, and when
providers have little control over patient outcomes or variation in important processes of
care is minimal. If a large number of patient factors that are difficult to observe influence
whether or not a patient has an adverse outcome, it may be difficult to separate the "quality
signal" from the noise in which it is embedded. Two techniques are applied to improve the
precision of an indicator:



• Univariate methods are used to estimate the "true" quality signal of an 
indicator based on information from the specific indicator and one year of 
data. 

• Multivariate signal extraction (MSX) methods are used to estimate the 
"true" quality signal based on information from a set of indicators and 
multiple years of data. In most cases, MSX methods extracted additional 
signal, which provided much more precise estimates of true hospital or area 
quality. 



[0026] To determine the sensitivity of potential indicators to bias from differences
in patient severity, unadjusted performance measures for specific hospitals were compared
with performance measures that had been adjusted for age and distance category (dcat), with dcat derived from the
transformed distance between the origin location of the episode and the care service venue
where the episode was consummated, or from which resources were dispatched in the case
of patients treated in situ. Five empirical tests were performed to investigate the degree of
bias in an indicator:



• Rank correlation coefficient of the area or hospital with (and without) risk
adjustment: gives the overall impact of risk adjustment on relative provider
or area performance.

• Average absolute value of change relative to mean: highlights the amount
of absolute change in performance, without reference to other providers'
performance.

• Percentage of highly ranked hospitals that remain in high decile: reports
the percentage of hospitals or areas that are in the highest deciles without
risk adjustment that remain there after risk adjustment is performed.

• Percentage of lowly ranked hospitals that remain in low decile: reports the
percentage of hospitals or areas that are in the lowest deciles without risk
adjustment that remain there after risk adjustment is performed.

• Percentage that change more than two deciles: identifies the percentage of
hospitals whose relative rank changes by a substantial percentage (more
than 20%) with and without risk adjustment.
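
The five bias tests listed above can be sketched in Python as follows. This is an illustrative sketch, not the disclosed implementation: the function names are hypothetical, ties in rank are broken by position, only the single highest and lowest decile are used, the Spearman formula for untied ranks is applied, and at least ten hospitals or areas are assumed.

```python
def ranks(values):
    """Rank from 1 (smallest) upward; ties are broken by position (an assumption)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        out[i] = rank
    return out


def decile(rank, n):
    """Map a 1-based rank among n units to a decile from 1 to 10."""
    return (rank - 1) * 10 // n + 1


def bias_summary(unadjusted, adjusted):
    n = len(unadjusted)
    if n < 10:
        raise ValueError("need at least ten units to form deciles")
    r_un, r_adj = ranks(unadjusted), ranks(adjusted)
    # Spearman rank correlation (formula for untied ranks).
    d2 = sum((a - b) ** 2 for a, b in zip(r_un, r_adj))
    spearman = 1 - 6 * d2 / (n * (n * n - 1))
    overall = sum(unadjusted) / n
    avg_abs = sum(abs(a - u) for u, a in zip(unadjusted, adjusted)) / n / overall
    dec_un = [decile(r, n) for r in r_un]
    dec_adj = [decile(r, n) for r in r_adj]
    top = [i for i in range(n) if dec_un[i] == 10]
    low = [i for i in range(n) if dec_un[i] == 1]
    moved = sum(abs(a - b) > 2 for a, b in zip(dec_un, dec_adj))
    return {
        "rank_correlation": spearman,
        "avg_abs_change_rel_mean": avg_abs,
        "pct_high_stay": 100 * sum(dec_adj[i] == 10 for i in top) / len(top),
        "pct_low_stay": 100 * sum(dec_adj[i] == 1 for i in low) / len(low),
        "pct_move_over_two_deciles": 100 * moved / n,
    }
```

A rank correlation near 1 together with high "stay" percentages suggests that risk adjustment changes relative standing very little for that indicator.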

[0027] Despite the unique strengths of the indicators, there are several issues that
should be considered when using these indicators. First, for some indicators, differences in
socioeconomic status have been shown to explain a substantial part, perhaps most, of the
variation in indicator rates across areas. The complexity of the relationship between
socioeconomic status and indicator rates makes it difficult to delineate how much of the 
observed relationships are due to true access to care difficulties in potentially underserved 
populations, or due to other patient characteristics, unrelated to quality of care, that vary 
systematically by socioeconomic status. For some of the indicators, patient preferences and 
hospital capabilities for inpatient or outpatient care might explain variations in 
hospitalizations. In addition, environmental conditions that are not under the direct control 
of the health care system can substantially influence some of the indicators. For example, 
the chronic obstructive pulmonary disease (COPD) and asthma admission rates are likely
to be higher in areas with poor air quality. 

[0028] Second, the evidence related to potentially avoidable hospital admissions is

limited for each indicator, because many indicators have been developed as parts of sets. 
Only a few studies have attempted to validate individual indicators rather than whole 
measure sets. See Weissman JS. Rates of avoidable hospitalization by insurance status. JAMA.
1992;268:2388-94; Bindman AB. Preventable hospitalizations and access to healthcare. 
JAMA. 1995;274:305-11; Silver MP. Ambulatory care sensitive hospitalization rates in the 
aged Medicare population in Utah: a rural-urban comparison. J Rural Health. 1997; 
13:285-94. 
[0029] "Raw" unadjusted measures of hospital or area performance for each

indicator are simple means constructed from the encounter data and census population 
counts. Simple means do not account for differences in the indicators that are attributable 
to differences in patient mix across hospitals that are measured in the encounter data, or 
demographic differences across areas. In general, risk adjustment involves conducting a 
multivariate regression to adjust expected performance for these measured patient and 
population characteristics. Although complex, multivariate regression methods are the 
standard technique for risk-adjustment because they permit the simultaneous consideration 
of multiple patient characteristics and interaction among those characteristics. The 
interpretation of the risk-adjusted estimate is straightforward: it is the value of the indicator 
expected at that hospital if the hospital had an "average" patient case-mix. 

[0030] Empirical performance: discrimination. A critical aspect of the performance 

of a risk-adjustment model is the extent to which the model predicts a higher probability of 
an event for patients who actually experience the event. The statistical test of 
discrimination is generally expressed as a C-statistic or $R^2$ (how much of the variation in
the patient level data the model explains). In general, systems that discriminate more have 
the potential to influence indicator measures more substantially. Many severity-adjustment 
systems were designed primarily to predict in subsequent periods (e.g., resource 
consumption next year). However, for purposes of evaluating access indicator 
performance, the estimation of concurrent risk is more important (i.e., differences in the 
likelihood of a beneficiary's obtaining access and appropriately utilizing services to which 
she/he is eligible in the current time period). Ideally, discrimination is assessed using $R^2$ or
another statistic of predicted variation that is computed on a separate data source from the one
used to develop the model, in order to avoid "over-fitting" (i.e., the model might appear to do
well in part because it explains nonsystematic variations in the data used to develop the
model).

[0031] Calibration is also an important component of validation. Calibration is a 

measure of whether the mean of the predicted outcomes equals the mean of the actual 
outcomes for the entire population and for population subgroups. The statistical test is often 
expressed as a Chi-square or "goodness-of-fit" for the equivalence of means of population
subgroups. Even if the severity-adjustment system does not predict well at the level of 
individuals, it may predict well at the aggregate (group) level of, say, women, 70-74 years 
of age. Over-fitting is an issue as well, unless a different data source is used to validate the 
model than was used to estimate the model. 
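
As a minimal sketch of the two validation checks just described, the Python below computes a C-statistic by pairwise comparison and a simple calibration table by subgroup. The function names are hypothetical, tied predictions are counted as one half (a common convention, assumed here), and no formal Chi-square test is performed.

```python
def c_statistic(predicted, observed):
    """Share of (event, non-event) pairs in which the event case has the
    higher predicted probability; tied predictions count as one half."""
    events = [p for p, y in zip(predicted, observed) if y == 1]
    non_events = [p for p, y in zip(predicted, observed) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = sum(1.0 if e > n else 0.5 if e == n else 0.0
                for e in events for n in non_events)
    return score / (len(events) * len(non_events))


def calibration_by_group(predicted, observed, groups):
    """Mean predicted and mean observed outcome for each subgroup label."""
    table = {}
    for p, y, g in zip(predicted, observed, groups):
        sums = table.setdefault(g, [0.0, 0.0, 0])
        sums[0] += p
        sums[1] += y
        sums[2] += 1
    return {g: (sp / k, sy / k) for g, (sp, sy, k) in table.items()}
```

To guard against over-fitting, both functions would be applied to data other than the data used to estimate the model.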

[0032] Risk-adjustment is implemented using patient care episode demographics 

(age and dcat). Then statistical methods are used to account for additional sources of noise 
and bias not accounted for by observed patient characteristics. By applying these methods 
to the indicators, the relative importance of both risk adjustment and smoothing can be 
evaluated in terms of the relative performance of hospitals (or areas) compared to the "raw" 
unadjusted indicators based on simple means from encounter data. In general, simple 
means fail to account both for differences in the indicators that are attributable to 
systematic differences in measured and unmeasured patient mix across hospitals/areas that 
are measured in the discharge data, and for random variations in patient mix. A 
multivariate regression approach adjusts performance measures for measured differences in 
patient mix and permits the inclusion of multiple patient demographic and severity 
characteristics. 
[0033] Specifically, let $Y^k_{ijt}$ denote whether or not the event associated with a
particular indicator k (k=1,...,K) was observed for a particular patient i at hospital/area j
(j=1,...,J) in year t (t=1,...,T). Then the regression to construct a risk-adjusted "raw"
estimate of a hospital or area's performance on each indicator can be written as:

(1) $Y^k_{ijt} = M^k_{jt} + Z_{ijt}\,\Pi^k_t + \epsilon^k_{ijt}$, where

$Y^k_{ijt}$ is the k-th quality indicator for patient i discharged from hospital/area j in year t
(i.e., whether or not the event associated with the indicator occurred on that
discharge);

$M^k_{jt}$ is the "raw" adjusted measure for indicator k for hospital/area j in year t (i.e.,
the hospital/area "fixed effect" in the patient-level regression);

$Z_{ijt}$ is a vector of patient covariates for patient i discharged from hospital/area j in
year t (i.e., the patient-level measures used as risk adjusters);

$\Pi^k_t$ is a vector of parameters in each year t, giving the effect of each patient risk
adjuster on indicator k (i.e., the magnitude of the risk adjustment associated with
each patient measure); and

$\epsilon^k_{ijt}$ is the unexplained residual in this patient-level model.

[0034] The hospital or area specific intercept $M^k_{jt}$ is the "raw" adjusted measure of

a hospital or area's performance on the indicator, holding patient covariates constant. In 
most of the empirical analysis that follows, the patient-level analysis is conducted using 
data from all hospitals and areas. (The model shown implies that each hospital or area has 
data for all years, and with each year has data on all outcomes; however, this is not 
essential to apply risk adjustment methods.) 

[0035] These patient-level regressions are estimated by linear ordinary least-squares
(OLS). In general, the dependent variables in the regressions are dichotomous,
which raises the question of whether a method for binary dependent variables such as logit
or probit estimation might be more appropriate. OLS regression has been successfully used
for similar analyses of hospital/area differences in outcomes. In addition, estimating logit
or probit models with hospital or area fixed effects cannot be done with standard methods; 
it requires computationally intensive conditional maximum likelihood methods that are not 
easily extended to multiple years and multiple measures. 
[0036] A commonly used solution to this problem is to estimate a logit model 

without hospital or area effects, and then to use the resulting predictions as estimates of the 
expected indicator. However, this method yields biased estimates and predictions of health 
system performance. In contrast, it is easy to incorporate hospital or area fixed effects into 
OLS regression analysis. The resulting estimates are not biased, and the hospital or area 
fixed effects provide direct and easily-interpretable estimates of the outcome rate for a 
particular hospital or area measure in a particular year, holding constant all observed 
patient characteristics. 
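
To illustrate how area fixed effects enter an OLS regression of the form of equation (1), the sketch below handles one indicator, one year, and a single scalar covariate; the single covariate is a simplifying assumption, since the text uses a vector of risk adjusters. It uses the within-area (demeaned) estimator, which gives the same slope as OLS with one dummy variable per area. All names are hypothetical.

```python
from collections import defaultdict


def fixed_effects_lpm(records):
    """records: iterable of (area, covariate, outcome) tuples.

    Returns (slope, intercepts): the OLS slope on the covariate and the
    per-area fixed effects (the "raw" adjusted measures).
    """
    by_area = defaultdict(list)
    for area, x, y in records:
        by_area[area].append((x, y))
    means = {a: (sum(x for x, _ in rows) / len(rows),
                 sum(y for _, y in rows) / len(rows))
             for a, rows in by_area.items()}
    # Within-area sums of cross-products and squares.
    sxy = sum((x - means[a][0]) * (y - means[a][1])
              for a, rows in by_area.items() for x, y in rows)
    sxx = sum((x - means[a][0]) ** 2
              for a, rows in by_area.items() for x, _ in rows)
    if sxx == 0:
        raise ValueError("covariate does not vary within any area")
    slope = sxy / sxx
    intercepts = {a: my - slope * mx for a, (mx, my) in means.items()}
    return slope, intercepts
```

Each intercept is the area's outcome rate with the covariate held at zero; adding the slope times the overall mean covariate gives the rate expected at an average case-mix.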

[0037] A potential limitation of the OLS approach is that it may yield biased 

estimates of confidence intervals, because the errors of a linear probability model are 
necessarily heteroskedastic. Given the large sample sizes for the parameters estimated 
from these regressions (most indicators involve thousands of "denominator" encounters per 
year), statistical efficiency is not likely to be an important concern. Nevertheless,
models were also estimated using Weighted Least Squares to account for heteroskedasticity, in a
manner familiar to those skilled in the art, to see whether the estimates were affected. Very similar
estimates of adjusted indicator performance were obtained.

[0038] Specifically, in addition to age, distance category, and age*dcat interactions 

as adjusters, the model also included. For each hospital, a vector of K adjusted indicator 
estimates is observed over T years from estimating the patient-level regressions (1) run 
separately by year for each indicator. Each indicator is a noisy estimate of true health
system quality in each area. 

[0039] In particular, let $M_j$ be the 1×TK vector of estimated indicator performance
for hospital j. Then:

(2) $M_j = \mu_j + \epsilon_j$

where $\mu_j$ is a 1×TK vector of the true hospital intercepts for hospital j, and $\epsilon_j$ is the
estimation error (which has mean zero and is uncorrelated with $\mu_j$). Note that the variance
of $\epsilon_j$ can be estimated from the patient-level regressions, since this is simply the variance of
the regression estimates $M_j$. In particular, $E(\epsilon_{jt}'\epsilon_{jt}) = \Omega_{jt}$ and $E(\epsilon_{jt}'\epsilon_{js}) = 0$ for $t \neq s$,
where $\Omega_{jt}$ is the covariance matrix of the intercept estimates for hospital j in year t.
[0040] A linear combination of each hospital's observed indicators must be created
in such a way that it minimizes the mean-squared prediction error. Thus, the following
hypothetical regression is of interest:

(3) $\mu_j = M_j \beta_j + \nu_j$

This regression cannot be run directly, since $\mu$ is unobserved and the optimal $\beta$ varies by hospital and
year. While equation (3) cannot be directly estimated, it is possible to estimate the
parameters for this hypothetical regression. In general, the minimum mean squared error
linear predictor of $\mu$ is given by $M_j \beta$, where $\beta = [E(M_j'M_j)]^{-1} E(M_j'\mu_j)$. This best linear
predictor depends on two moment matrices:

(4.1) $E(M_j'M_j) = E(\mu_j'\mu_j) + E(\epsilon_j'\epsilon_j)$

(4.2) $E(M_j'\mu_j) = E(\mu_j'\mu_j)$

The required moment matrices are estimated directly as follows:

• Estimate $E(\epsilon_j'\epsilon_j)$ with the patient-level OLS estimate of the covariance
matrix for the parameter estimates $M_j$. Call this estimate $S_j$. Note that $S_j$
varies across hospitals.

• Estimate $E(\mu_j'\mu_j)$ by noting that $E(M_j'M_j - S_j) = E(\mu_j'\mu_j)$. If we assume
that $E(\mu_j'\mu_j)$ is the same for all hospitals, then it can be estimated by the
sample average of $M_j'M_j - S_j$. Note that it is easy to relax the assumption
that $E(\mu_j'\mu_j)$ is the same for all hospitals by calculating $M_j'M_j - S_j$ for
subgroups of hospitals.

[0041] With estimates of $E(\mu_j'\mu_j)$ and $E(\epsilon_j'\epsilon_j)$, one can form least squares
estimates of the parameters in equation (3) which minimize the mean squared error.
Analogous to simple regression, the prediction of a hospital's true intercepts is given by:

(5) $\hat{\mu}_j = M_j [E(M_j'M_j)]^{-1} E(M_j'\mu_j) = M_j [E(\mu_j'\mu_j) + E(\epsilon_j'\epsilon_j)]^{-1} E(\mu_j'\mu_j)$

using estimates of $E(\mu_j'\mu_j)$ and $E(\epsilon_j'\epsilon_j)$ in place of their true values. One can use the
estimated moments to calculate other statistics of interest as well, such as the standard error
of the prediction and the r-squared for equation (3), based on the usual least squares
formulas. Estimates based on equation (5) are referred to as "filtered" estimates, since the
key advantage of such estimates is that they optimally filter out the estimation error in the
raw quality indicators.

[0042] Equation (5) in combination with estimates of the required moment matrices 

provides the basis for estimates of hospital quality or health service area quality with regard 
to care access. Such estimates of hospital quality have a number of attractive properties. 
First, they incorporate information in a systematic way from many outcome indicators and
many years into the predictions of any one outcome. Moreover, if the moment matrices
were known, the estimates of hospital quality would represent the optimal linear predictors, based
on a mean squared error criterion. Finally, these estimates maintain many of the attractive
aspects of existing Bayesian approaches, while dramatically simplifying the complexity of 
the estimation. It is possible to construct univariate smoothed estimates of hospital quality, 
based only on empirical estimates for particular measures, using the models just described 
but restricting the dimension of Mj to only a particular indicator k and time period t. Of 
course, to the extent that the provider indicators are correlated with each other and over 
time, this will result in a less precise estimate. 
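
For the univariate case just mentioned (one indicator, one time period), the filtered estimate reduces to a scalar shrinkage weight, namely the signal variance divided by the sum of the signal variance and that hospital's estimation variance. The Python sketch below illustrates this special case. Centering the raw estimates on their overall mean and flooring the signal variance at zero are assumptions of the sketch, and the names are hypothetical.

```python
def filtered_estimates(raw, sampling_var):
    """Univariate shrinkage: weight = signal variance / (signal variance + S_j)."""
    n = len(raw)
    grand_mean = sum(raw) / n
    centered = [m - grand_mean for m in raw]
    # Sample analogue of E(M'M - S), floored at zero.
    signal_var = max(
        sum(c * c - s for c, s in zip(centered, sampling_var)) / n, 0.0)
    out = []
    for c, s in zip(centered, sampling_var):
        denom = signal_var + s
        weight = signal_var / denom if denom > 0 else 0.0
        out.append(grand_mean + weight * c)
    return out
```

Hospitals with noisier raw estimates (larger sampling variance) are pulled more strongly toward the overall mean.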
[0043] With the system and method applied over time with multiple years of data 

accruing longitudinally, it is advantageous to impose structure on ECfij^j) for two reasons. 
First, this improves the precision of the estimated moments by limiting, the number of 
parameters that need to be estimated. Second, a time series structure allows for out-of- 
sample forecasts. A non-stationary, first-order Vector Autoregression structure (VAR) is 
used. The VAR model is a generalization of the usual autoregressive model, and assumes 
that each hospital's quality indicators in a given year depend on the hospital's quality 
indicators in past years plus a contemporaneous shock that may be correlated across quality 
indicators. In most of what follows, a non-stationary first-order VAR is assumed for $\mu_{jt}$
(1xK), where:

(6) $\mu_{jt} = \mu_{j,t-1}\Phi + u_{jt}$, with $V(u_{jt}) = \Sigma$ and $V(\mu_{j1}) = \Gamma$.
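
By way of illustration only, the second moments implied by a first-order VAR of the form of equation (6) can be computed recursively. The sketch assumes row-vector notation, innovations uncorrelated with past values, an initial variance equal to the initial variance condition, and small matrices stored as nested lists. All function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def implied_variances(phi, sigma, gamma, n_periods):
    """Return [V_1, ..., V_T] with V_1 = gamma and
    V_t = phi' V_{t-1} phi + sigma for t > 1."""
    variances = [gamma]
    for _ in range(n_periods - 1):
        prev = variances[-1]
        variances.append(add(matmul(matmul(transpose(phi), prev), phi), sigma))
    return variances


def implied_lag1_covariance(phi, variance_prev):
    """Covariance between period t-1 and period t: V_{t-1} phi."""
    return matmul(variance_prev, phi)


# Illustrative parameter values only (K = 2 indicators, T = 3 years).
phi = [[0.5, 0.0], [0.0, 0.5]]
sigma = [[1.0, 0.0], [0.0, 1.0]]
gamma = [[2.0, 0.0], [0.0, 2.0]]
variances = implied_variances(phi, sigma, gamma, n_periods=3)
lag1 = implied_lag1_covariance(phi, variances[0])
```

Because the variance is allowed to change from year to year, the process need not be stationary, which is why an initial variance condition is carried as a separate parameter.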

[0044] Thus, estimates are needed of the lag coefficient ($\Phi$), the variance matrix of

the innovations ($\Sigma$), and the initial variance condition ($\Gamma$), where $\Sigma$ and $\Gamma$ are symmetric
KxK matrices of parameters and $\Phi$ is a general KxK matrix of parameters, for a total of
$2K^2 + K$ parameters. For example, ten parameters must be estimated for a VAR model with
two outcomes (K=2).
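
By way of illustration only, the parameter count can be checked with a short sketch. It assumes one general KxK matrix and two symmetric KxK matrices, as described above. The function name is hypothetical.

```python
def var1_parameter_count(k):
    """Free parameters in a first-order VAR with k outcomes.

    The lag coefficient is a general k x k matrix (k*k parameters).
    The innovation variance and the initial variance are symmetric
    k x k matrices (k*(k+1)/2 parameters each).
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    general = k * k
    symmetric = k * (k + 1) // 2
    return general + 2 * symmetric


counts = {k: var1_parameter_count(k) for k in (1, 2, 3)}
```

For K=2 the count is ten, in agreement with the example given above.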

[0045] The VAR structure implies that $E(M_j'M_j - S_j) = E(\mu_j'\mu_j) = f(\Phi,\Sigma,\Gamma)$. Thus,

the VAR parameters can be estimated by Optimal Minimum Distance (OMD) methods,
i.e., by choosing the VAR parameters so that the theoretical moment matrix, $f(\Phi,\Sigma,\Gamma)$, is as
close as possible to the corresponding sample moments from the sample average of $M_j'M_j - S_j$.
More specifically, let $d_j$ be a vector of the non-redundant (lower triangular) elements
of $M_j'M_j - S_j$, and let $\delta$ be a vector of the corresponding moments from the true moment
matrix, so that $\delta = g(\Phi,\Sigma,\Gamma)$. Then the OMD estimates of $(\Phi,\Sigma,\Gamma)$ minimize the following
OMD objective function:



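(7) $q = N\,[D - g(\Phi,\Sigma,\Gamma)]'\,V^{-1}\,[D - g(\Phi,\Sigma,\Gamma)]$

where N is the number of hospitals in the sample, V is the sample estimate of the covariance matrix for d, and D is the sample average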
of d. If the VAR model is correct, the value of the objective function, q, will be distributed 
$\chi^2(p)$, where p is the degree of over-identification (the difference between the number of
elements in d and the number of parameters being estimated). Thus, q provides a goodness
of fit statistic that indicates how well the VAR model fits the actual covariances in the data.
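
By way of illustration only, the goodness of fit statistic can be evaluated for a candidate set of model moments as follows. The sketch assumes the usual minimum-distance quadratic form, q = N (D - g)' V^{-1} (D - g), where N is the number of units in the sample. The small linear solver and all function names are hypothetical, and the minimization over the VAR parameters is not shown.

```python
def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [[float(v) for v in row] + [float(b)] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[row][c] -= factor * aug[col][c]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][c] * x[c] for c in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def omd_objective(sample_mean, model_moments, cov, n_units):
    """Assumed form: q = N * (D - g)' V^{-1} (D - g)."""
    resid = [d - g for d, g in zip(sample_mean, model_moments)]
    weighted = solve(cov, resid)  # V^{-1} (D - g) without forming the inverse
    return n_units * sum(r * w for r, w in zip(resid, weighted))


def overidentification_df(n_moments, n_params):
    """Degrees of freedom p: moments in d minus parameters estimated."""
    return n_moments - n_params


# Illustrative values only.
q_example = omd_objective([1.0, 2.0], [0.5, 2.0],
                          [[0.25, 0.0], [0.0, 1.0]], n_units=100)
q_exact_fit = omd_objective([1.0, 2.0], [1.0, 2.0],
                            [[0.25, 0.0], [0.0, 1.0]], n_units=100)
```

A value of q that is large relative to the chi-squared reference distribution with p degrees of freedom would indicate that the VAR structure fits the sample moments poorly.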
[0046] Finally, estimated $R^2$ statistics are used to evaluate the filtered estimates'

ability to predict (in sample) and forecast (out-of-sample) variation in the true intercepts, 
and to compare the methods used with conventional methods (e.g., simple averages or univariate
shrinkage estimators). If true hospital intercepts ($\mu$) were observed, a natural metric for
evaluating the predictions would be the sample R-squared: 

(8) $R^2 = 1 - \frac{\sum_j \hat{u}_j^2}{\sum_j \mu_j^2}$

where

$\hat{u}_j = \mu_j - \hat{\mu}_j$

is the prediction error. Of course $\mu$ is not observed. Therefore, an estimate is constructed
using the estimate of $E(\mu_j'\mu_j)$ for the denominator, and the estimate of $E(\hat{u}_j'\hat{u}_j)$
for the terms in the numerator. Finally, a weighted R-squared is calculated (weighting by
the number of patients treated by each hospital). 
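
By way of illustration only, the weighted R-squared can be sketched as follows. The sketch assumes that, for each hospital, an estimated second moment of the true intercept and an estimated second moment of the prediction error are already available, and that each hospital is weighted by its number of patients. The function and argument names are hypothetical.

```python
def weighted_r_squared(signal_var, error_var, n_patients):
    """Return 1 - sum(w * error_var) / sum(w * signal_var), with w = patients.

    signal_var : per-hospital estimates of the second moment of the intercept
    error_var  : per-hospital estimates of the second moment of the
                 prediction error
    n_patients : per-hospital patient counts used as weights
    """
    if not (len(signal_var) == len(error_var) == len(n_patients)):
        raise ValueError("inputs must have the same length")
    numerator = sum(w * e for w, e in zip(n_patients, error_var))
    denominator = sum(w * s for w, s in zip(n_patients, signal_var))
    if denominator <= 0:
        raise ValueError("weighted signal variance must be positive")
    return 1.0 - numerator / denominator


# Illustrative values only: two hospitals with 300 and 100 patients.
r2 = weighted_r_squared(signal_var=[0.04, 0.04],
                        error_var=[0.01, 0.03],
                        n_patients=[300, 100])
```

Weighting by patient volume gives larger hospitals, whose estimates affect more patients, more influence on the summary statistic.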
[0058] In accordance with the invention, a method and system mitigating the

limitations enumerated above and suitable for a risk-adjustment procedure for correcting 
reported rates of health care utilization or access indicators are provided. The invention is 
intended to be used by health care organizations in monitoring and undertaking steps to 
correct or improve service delivery, or by units of government seeking to evaluate access. 

[0059] Additional advantages and novel features of the invention will be set forth in 

part in a description which follows, and in part will become apparent to those skilled in the 
art upon examination of the following, or may be learned by practice of the invention. 

BRIEF DESCRIPTION OF THE DRAWINGS 
[0060] The present invention is described in detail below with reference to the 

attached drawing figures, wherein: 
[0061] FIG. 1 is a flow chart illustrating a preferred method for developing,

optimizing, and validating the locally normed transformed distances and populations, using
the Anderson-Darling metric as a stopping criterion (in the preferred embodiment the
significance level alpha may be selected by the user, but in most cases it will be 0.05); and
[0062] FIG. 2 is a flow chart illustrating an exemplary embodiment of a plurality of

possible risk-adjustment embodiments, implementing the method of FIG. 1.

DETAILED DESCRIPTION OF THE INVENTION 
[0063] Referring now to FIG. 1, a diagram is shown of the elements comprising the

method and system for generating the locally normed distance index and verifying and 
validating whether such an index achieves adequate goodness of fit in the intended 
geographic region of deployment, sufficient for satisfactory performance in the use for
risk-adjusting indicators of access to and utilization of health services. 

[0064] Referring now to FIG. 2, a diagram is shown of the elements comprising the 

method and system for applying the locally normed distance index, stratified into a finite 
number of categories, to risk-adjust the incidence rates for access-related utilization 
indicators. The data element HOSPSTCO provides flexibility to calculate the indicators by 
hospital location or by patient residence. If the user wants to calculate the indicators using 
the population associated with the hospital location as the denominator, the values for this 
variable should be the individual hospital FIPS state/county codes. If the user instead wants to
calculate the indicators based on the population of the MSA region or county associated with
the inception of each care episode, which may or may not be the locale in which the patient
resides, the values for this variable should be the FIPS state/county code or PD associated
with each individual location where a care episode commences.
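
By way of illustration only, one way to apply a stratified distance index to an incidence rate is indirect standardization. The source does not specify the adjustment formula, so the approach below is an assumption, as are the stratum labels, the reference rates, and the function name. All numbers are invented for the example.

```python
def indirectly_standardized_rate(observed_events, population_by_stratum,
                                 reference_rate_by_stratum,
                                 reference_overall_rate):
    """Adjust an area's rate for its mix of distance strata.

    expected = sum over strata of (area population in the stratum
               * reference rate for that stratum)
    adjusted = (observed / expected) * reference overall rate
    """
    expected = sum(population_by_stratum[s] * reference_rate_by_stratum[s]
                   for s in population_by_stratum)
    if expected <= 0:
        raise ValueError("expected event count must be positive")
    return (observed_events / expected) * reference_overall_rate


# Hypothetical area: residents grouped into three distance-index strata.
population = {"near": 8000, "mid": 1500, "far": 500}
reference_rates = {"near": 0.010, "mid": 0.020, "far": 0.040}
adjusted = indirectly_standardized_rate(
    observed_events=143,
    population_by_stratum=population,
    reference_rate_by_stratum=reference_rates,
    reference_overall_rate=0.012,
)
```

In this example the area records 143 events against 130 expected, so its adjusted rate is ten percent above the reference overall rate.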

[0065] If the hospital FIPS code is used in HOSPSTCO, rates may be biased for

hospitals that serve as regional referral centers. These hospitals are more likely to treat
patients from outside the MSA, the county, or even the state in which the facility is located
than are hospitals that are not tertiary centers. Therefore, using the care episode
origination FIPS state/county code for analysis more accurately reflects the true population 
at risk. Evaluation of geographic variations in admissions for ambulatory care sensitive 
conditions by episode FIPS or postcode district or zip code can result in better information 
to guide community or provider response. 
[0066] It is possible that some records in the input data file may be missing the

patient FIPS code. Any records with missing values in the HOSPSTCO data field are 
excluded from the calculations of observed, risk-adjusted and smoothed indicator rates. 
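
By way of illustration only, the exclusion of records with a missing HOSPSTCO value can be sketched as follows. The sketch assumes each record is a dictionary, that a missing value appears as None or as a blank string, and that area populations are supplied separately. The function names and the sample codes are hypothetical and do not reproduce the attached SAS source code.

```python
from collections import Counter


def count_events_by_area(records, field="HOSPSTCO"):
    """Count records per area code, skipping records with a missing code."""
    counts = Counter()
    excluded = 0
    for record in records:
        code = record.get(field)
        if code is None or str(code).strip() == "":
            excluded += 1  # missing code: left out of all rate calculations
            continue
        counts[str(code).strip()] += 1
    return counts, excluded


def observed_rates(counts, population):
    """Observed rate per area = events / population, where population is known."""
    return {code: counts[code] / population[code]
            for code in counts if population.get(code)}


# Illustrative records only.
records = [
    {"HOSPSTCO": "29095"},
    {"HOSPSTCO": "29095"},
    {"HOSPSTCO": None},
    {"HOSPSTCO": ""},
    {"HOSPSTCO": "20091"},
]
counts, excluded = count_events_by_area(records)
rates = observed_rates(counts, {"29095": 1000, "20091": 500})
```

Reporting the number of excluded records alongside the rates lets the user judge whether missing codes could materially affect the results.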

[0067] A preferred embodiment of the present invention in SAS source code format 

and a sample data set are attached hereto as an exemplary means of implementing the
present invention. 

[0068] Although the invention has been described with reference to the preferred 

embodiment illustrated in the attached drawing figures, it is noted that substitutions may be 
made and equivalents employed herein without departing from the scope of the invention 
as recited in the claims. For example, additional steps may be added and steps omitted 
without departing from the scope of this invention. 